AMENDMENT TO THE CLAIMS

1. (Original) A method of compressing a log of linguistic data, 
the log having a plurality of linguistic strings, each string 
including at least two tokens, the method comprising: 

applying a compression operation to each string; 
determining if any two strings match each other after the 
compression operation; and 
removing one of the two matching strings from the log. 

2. (Original) The method of claim 1, wherein the log is a log of 
queries. 

3. (Original) The method of claim 2, wherein the queries are 
queries relative to a help function. 

4. (Original) The method of claim 3, wherein the help-related 
queries are relative to a computer system. 

5. (Original) The method of claim 1, wherein the compression 
operation is character-based. 

6. (Original) The method of claim 1, wherein the compression 
operation is token-based. 

7. (Original) The method of claim 1, wherein the compression 
operation is subsumption. 

8. (Original) The method of claim 7, wherein subsumption includes 
applying an impossibility condition to selectively compute edit 
distance. 

9. (Original) The method of claim 1, and further comprising: 
applying a second compression operation to each string; 
determining if any two strings match each other after the 
second compression operation; and 
removing one of the two matching strings from the log. 

10. (Original) The method of claim 9, wherein the first compression 
operation is character-based and the second compression operation 
is token based. 

11. (Original) The method of claim 10, and further comprising 
applying subsumption after the second compression operation is 
complete. 

12. (Original) The method of claim 11, wherein the subsumption 
operation is repeated for the log. 

13. (Original) The method of claim 1, and further comprising 
training a statistical process with the compressed log. 
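
The two-pass flow recited in claims 1, 9 and 10 can be illustrated with a minimal Python sketch. The claims do not define what the character-based and token-based operations actually do, so `char_compress`, `token_compress`, `dedupe` and the `STOPWORDS` set below are hypothetical stand-ins chosen only for illustration, not the operations of the application.

```python
import string

# Assumed stopword list; the claims do not specify one.
STOPWORDS = {"a", "an", "the", "to", "do", "i", "how", "my"}


def char_compress(s):
    """Hypothetical character-based operation: lowercase the string,
    drop ASCII punctuation, and collapse runs of whitespace."""
    s = s.lower().translate(str.maketrans("", "", string.punctuation))
    return " ".join(s.split())


def token_compress(s):
    """Hypothetical token-based operation: drop stopwords and sort the
    remaining tokens so that word order no longer matters."""
    tokens = [t for t in s.split() if t not in STOPWORDS]
    return " ".join(sorted(tokens))


def dedupe(log, op):
    """Apply op to each string, then keep only the first of any
    strings that match each other after the operation."""
    seen = set()
    out = []
    for s in log:
        key = op(s)
        if key not in seen:
            seen.add(key)
            out.append(key)
    return out


raw_log = [
    "How do I print a file?",
    "how do I print a file",
    "Print the file",
    "install printer",
]
after_char = dedupe(raw_log, char_compress)      # first pass (claim 1)
after_token = dedupe(after_char, token_compress)  # second pass (claim 9)
```

In this sketch the first pass merges the two queries that differ only in case and punctuation, and the second pass merges queries that differ only in stopwords or token order.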

14. (Original) A system for compressing a query log having a 
plurality of linguistic strings, each string having a plurality of 
tokens, the system comprising: 

an input for receiving a raw query log; 
memory for storing the raw query log; 

a processor for applying at least one compression 
operation to each string, and for scanning the 
modified strings to determine if any match each 
other so that one of the matching strings can be 
removed; and 

an output for providing a compressed query log once the 
removal is complete. 



15. (Original) The system of claim 14, wherein the queries are 
queries relative to a help function. 

16. (Currently Amended) The system of claim 15, wherein the 
help-related queries are relative to a computer system. 

17. (Original) The system of claim 14, wherein the at least one 
compression operation is character-based. 

18. (Original) The system of claim 14, wherein the at least one 
compression operation is token-based. 

19. (Original) The system of claim 14, wherein the at least one 
compression operation includes subsumption. 

20. (Original) The system of claim 19, wherein subsumption includes 
applying an impossibility condition to selectively compute edit 
distance. 

21. (Original) The system of claim 14, and further comprising: 

applying at least a second compression operation to each 
string; 

determining if any two strings match each other after the 
second compression operation; and 
removing one of the two matching strings from the log. 

22. (Original) The system of claim 21, wherein the first 
compression operation is character-based and the second compression 
operation is token based. 

23. (Original) The system of claim 22, and further comprising 
applying subsumption after the second compression operation is 
complete. 

24. (Original) The system of claim 23, wherein the subsumption 
operation is repeated for the log. 

25. (Original) The system of claim 14, and further comprising 
training a statistical process with the compressed log. 
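
Claims 8 and 20 recite an impossibility condition used to selectively compute edit distance, without defining it. One plausible reading, sketched below in Python as an assumption, relies on the fact that the edit distance between two token sequences is never smaller than the difference in their lengths. When that difference already exceeds the allowed threshold, a match is impossible and the distance computation can be skipped. The names `edit_distance`, `subsume` and the `max_dist` threshold are hypothetical.

```python
def edit_distance(a, b):
    """Levenshtein distance between two token sequences."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        cur = [i]
        for j, y in enumerate(b, 1):
            cur.append(min(prev[j] + 1,              # deletion
                           cur[j - 1] + 1,           # insertion
                           prev[j - 1] + (x != y)))  # substitution
        prev = cur
    return prev[-1]


def subsume(log, max_dist=1):
    """Drop any string within max_dist token edits of a string that
    was already kept. Returns the kept strings and the number of
    edit-distance computations actually performed."""
    kept = []
    computed = 0
    for s in log:
        toks = s.split()
        absorbed = False
        for k in kept:
            ktoks = k.split()
            # Assumed impossibility condition: the distance is at least
            # the length difference, so no match is possible here.
            if abs(len(toks) - len(ktoks)) > max_dist:
                continue
            computed += 1
            if edit_distance(toks, ktoks) <= max_dist:
                absorbed = True
                break
        if not absorbed:
            kept.append(s)
    return kept, computed


log = [
    "print file",
    "print file color",
    "install printer driver on windows",
    "remove printer driver",
]
kept, computed = subsume(log)
```

Here four pairwise comparisons are considered, but the length check rules out two of them before any distance is computed.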



